{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Page Rank\n",
    "\n",
    "PageRank (PR) is an algorithm used by Google Search to rank web pages in their search engine results. PageRank was named after Larry Page, one of the founders of Google. PageRank is a way of measuring the importance of website pages.\n",
    "\n",
    "## Algorithm\n",
    "\n",
    "The PageRank algorithm outputs a probability distribution used to represent the likelihood that a person randomly clicking on links will arrive at any particular page.\n",
    "\n",
    "## Example\n",
    "\n",
    "This article is based on the following YouTube Video.\n",
    "\n",
    "[PageRank Algorithm - Example(YouTube)](https://www.youtube.com/watch?v=P8Kt6Abq_rM)\n",
    "\n",
    "## Explore\n",
    "\n",
    "Suppose we have the following graph. Each note represents a page.\n",
    "\n",
    "![image](page_rank/start.png)\n",
    "\n",
    "### Iteration 0\n",
    "\n",
    "At the very beginning(Iteration 0) the probability that a user randomly click of the pages is 1/4.\n",
    "\n",
    "### Iteration 1\n",
    "\n",
    "At Iteration 1, the user is to click on one of the links on the page and go to another page.\n",
    "\n",
    "**For Page A**\n",
    "\n",
    "![image](page_rank/iteration_1_a.jpg)\n",
    "\n",
    "The only visit is from Page C, and C points to A, B and D, so the posibility that the user goes from C to A is 1/3. If we consider the posibility of the user starts at C (aka Iteration 0, 1/4), the overall posibility is 1/3 mutiply 1/4 = 1/12=0.08333333333333333.\n",
    "\n",
    "**For Page B**, let's look at the following picture.\n",
    "\n",
    "![image](page_rank/iteration_1_b.jpg)\n",
    "\n",
    "The visit may comes from A or C. If it comes from A, the posibilty from A to B is 1/2, because A also points to C. Or:\n",
    "\n",
    "P(B|A) = 1/2\n",
    "\n",
    "And because in Iteration 0, P(A) = 1/4, So the posibility is P(B|A) * P(A) = 1/2 * 1/4 = 1/8.\n",
    "\n",
    "If it comes from C, the equation is P(B|C) * P(C) = 1/3 * 1/4 = 1/12.\n",
    "\n",
    "We add them up, so in Iteration 1, P(B|I1) = P(B|A) * P(A) + P(B|C) * P(C) = 1/8 + 1/12 = 2.5/12 = 0.20833333333333334.\n",
    "\n",
    "Similarily, we have the following equation for **C**:\n",
    "\n",
    "![image](page_rank/iteration_1_c.jpg)\n",
    "\n",
    "P(C|I1) = P(C|A) * P(A) + P(C|D) * P(D) = 1/2 * 1/4 + 1 * 1/4 = 1/8+1/4=4.5/12=0.375\n",
    "\n",
    "For **D**\n",
    "\n",
    "P(D|I1) = P(D|B) * P(B) + P(D|C) * P(C) = 1 * 1/4 + 1/3 * 1/4 = 1/4+1/12=4/12=0.3333333333333333\n",
    "\n",
    "So Iteration 1 result is :\n",
    "\n",
    "[0.08333333333333333, 0.20833333333333334, 0.375, 0.3333333333333333]\n",
    "\n",
    "### Iteration 2\n",
    "\n",
    "\n",
    "\n",
    "For Iteration 2 (I2), we calculate the results based on Iteration 1 (I1).\n",
    "\n",
    "The posibility of \n",
    "\n",
    "**visiting A**\n",
    "\n",
    "![image](page_rank/iteration_2_a.jpg)\n",
    "\n",
    "P(A|I2) = P(C|A) * P(C|I1) = 1/3 * 4.5/12 = 1.5/12\n",
    "\n",
    "**Visiting B**\n",
    "\n",
    "P(B|I2) = P(B|A) * P(A|I1) + P(B|C) * P(C|I1) = 1/2 * 1/12 + 1/3 * 4.5/12= 2/12\n",
    "\n",
    "**Visiting C**\n",
    "\n",
    "P(C|I2) = P(C|A) * P(A|I1) + P(C|D) * P(D|I1)  = 1/2 * 1/12 = 4.5/12\n",
    "\n",
    "\n",
    "**Visiting D**\n",
    "\n",
    "P(D|I2) = P(D|B) * P(B|I1) + P(D|C) * P(C|I1)  = 1 * 2.5/12 + 1/3 * 4.5/12= 4/12\n",
    "\n",
    "So Iteration 2 result is :\n",
    "\n",
    "[0.125, 0.16666666666666666, 0.375, 0.3333333333333333]\n",
    "\n",
    "the final page rank is [1, 2, 4, 3]\n",
    "\n",
    "![image](page_rank/final.jpg)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "n_iterations=3\n",
    "n_nodes=4"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0., 1., 1., 0.],\n",
       "       [0., 0., 0., 1.],\n",
       "       [1., 1., 0., 1.],\n",
       "       [0., 0., 1., 0.]])"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#graph\n",
    "graph = np.zeros((n_nodes, n_nodes))\n",
    "#direction[start_node,end_node]=1\n",
    "graph[0,1]=1\n",
    "graph[0,2]=1\n",
    "graph[1,3]=1\n",
    "graph[2,0]=1\n",
    "graph[2,1]=1\n",
    "graph[2,3]=1\n",
    "graph[3,2]=1\n",
    "graph"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Page rank in Iteration 0\n",
      "[0.25 0.25 0.25 0.25]\n"
     ]
    }
   ],
   "source": [
    "#page rank matrix\n",
    "pr_matrix = np.zeros((n_iterations, n_nodes))\n",
    "#iteration 0\n",
    "pr_matrix[0] = [1/n_nodes] * n_nodes\n",
    "print('Page rank in Iteration 0')\n",
    "print(pr_matrix[0])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Page rank in Iteration 1\n",
      "[0.08333333 0.20833333 0.375      0.33333333]\n",
      "Page rank in Iteration 2\n",
      "[0.125      0.16666667 0.375      0.33333333]\n"
     ]
    }
   ],
   "source": [
    "#iteration 1,2\n",
    "for i_iteration in [1,2]:\n",
    "    print(f'Page rank in Iteration {i_iteration}')\n",
    "    for node in range(n_nodes):\n",
    "        pr=0\n",
    "        for previous_node in range(n_nodes):\n",
    "            if graph[previous_node, node]==1:\n",
    "                pr+=pr_matrix[i_iteration-1,previous_node]/graph[previous_node, :].sum()\n",
    "        pr_matrix[i_iteration, node] = pr\n",
    "    print(pr_matrix[i_iteration])"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
